Joint Sound Source Separation and Speaker Recognition
Authors
Abstract
Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source separation for simultaneous speech. This paper explains how NMF can be used to jointly solve the two problems in a multichannel speaker recognizer for simultaneous speech. It is shown how state-of-the-art multichannel NMF for blind source separation can be easily extended to incorporate speaker recognition. Experiments on the CHiME corpus show that this method outperforms the sequential approach of first applying source separation, followed by speaker recognition that uses state-of-the-art i-vector techniques.
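The abstract builds on the standard NMF decomposition of a non-negative matrix into two smaller non-negative factors. As background only, the sketch below shows a minimal single-channel NMF with multiplicative updates (Euclidean cost) on a toy spectrogram-like matrix; it is not the paper's multichannel or joint formulation, and the function name, rank, and data are illustrative assumptions.

import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    # Basic NMF with multiplicative updates (Lee & Seung, Euclidean cost):
    # factorizes a non-negative matrix V (features x frames) into
    # W (features x rank) and H (rank x frames) so that V ~ W @ H.
    rng = np.random.default_rng(0)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy usage: factor a spectrogram-like matrix into two components,
# e.g. per-speaker dictionaries for two concurrent speakers (illustrative only).
V = np.abs(np.random.randn(64, 100))
W, H = nmf(V, rank=2)
print(W.shape, H.shape)  # (64, 2) (2, 100)

In multichannel, speaker-aware variants such as the one described in the abstract, the columns of W are typically tied to speaker-specific dictionaries, but that extension is beyond this background sketch.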
Similar resources
Advances in audio source separation and multisource audio content retrieval
Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. ...
Robotic Sound Source Separation using Independent Vector Analysis
Besides haptics and vision, mobile robotic platforms are equipped with audition in order to autonomously navigate and interact with their environment. Speaker and speech recognition, as well as the recognition of different kinds of sounds, are vital tasks for human-robot interaction. In situations where more than one sound source is active, the mixture has to be separated before being passed to the ...
Comparison of a joint iterative method for multiple speaker identification with sequential blind source separation and speaker identification
An individual’s voice is hardly ever heard in complete isolation. More commonly, it occurs simultaneously along with other interfering sounds, including those of other overlapping voices. Though there has been a great deal of progress in automatic speaker identification, the majority of past work has focused on the case of non-overlapping speakers. Many of these systems are easily confounded by...
Combining Independent Component Analysis and Sound Stream Segregation
This paper reports the issues and results of the AI Challenge "Understanding Three Simultaneous Speeches". First, the issues of the Challenge are revisited. We emphasize the importance of fusing information from various attributes of speech (sounds) when separating speech from a mixture of sounds. This emphasis is supported by comparing two methods of speech separation: computational auditory scene...
Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments
The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. ...
Journal:
Volume / Issue:
Pages: -
Publication year: 2016